ABSTRACT

The employment of judgment as an evaluation tool at 
the early stage of curriculum development is advantageous. The 
reasons why this is so are: (1) writers exhibit more readiness to 
change elements of the program during the early stages of curriculum 
development; (2) there is great economic advantage in using such an 
evaluation tool; and (3) utilization of expert judgment may decrease 
the time needed for program development. The limitations on the
utilization of judgmental data relate to what to judge: the
relationships between specifications and outcomes, specifications and
planned activities, planned and actual activities, and actual
activities and outcomes.
The present paper does not contain a detailed list of questions that 
may be presented to experts, but rather concentrates on indicating 
several aspects of the program that can be judged. Four major aspects 
of the fit of planned activities to the program specifications are 
content, presentation, learning apparatus, and relatedness to stated 
objectives. Predictions of future events relate to actual activities,
student interest, difficulty level, teachers' reaction, and community
reaction. The selection of experts for judging the fit between the 
program specifications and the planned learning activities should be 
made on the basis of logical considerations. Three methods of
examining the validity of judgmental data are: (1) quantifying the
degree of consensus obtained among experts with regard to different
issues; (2) employing the "shuffle test"; and (3) experimental
validation. (DB)

The collection and analysis of empirical data with the intention of
providing suggestions for the modification of curriculum material has been
termed "formative evaluation" (Scriven, 1967). The most frequently used data
in such contexts are responses to specially devised tests and records of
structured and unstructured class observations. The possibilities and limits
of using experts' judgments for formative evaluation have not yet been
systematically explored, though data of this type are frequently used in
the process of curriculum evaluation.

Experts in curriculum evaluation have indicated the importance of judg- 
mental data. Stake (1967) included "judgment" as a genuine part of a data 
matrix, which constitutes the basis for the inference about curriculum material. 
According to Fox (1971), the expert judgment expressed within the framework of
the deliberations of the curriculum team (and refined and revised in the face
of opposing and/or supporting judgments of other experts) is the most valid
evaluation strategy. Fenton suggested a series of questions which could be
presented to experts, relying on "consensus among experts as the test of the
validity of expert judgments" (Fenton, 1973).

The importance of judgmental data is indicated not only in theoretical
papers and in models; practitioners also report the use of such data in the
actual work of curriculum evaluation (Flanagan & Jung, 1970; Lewy, 1971).
It is therefore rather surprising that only a few reports are available
describing the actual utilization of judgmental data in curriculum evaluation
(see, for instance, Peri, 1973), and that no attempt has been made to develop
a methodological framework which may serve as a guideline for the collection
and analysis of data of this type.



The Place of Judgmental Data in Formative Curriculum Evaluation



Formative curriculum evaluation is not a one-shot action; it is rather
a sequence of activities which runs simultaneously with the curriculum
development process from its very beginning. The evaluator, who is responsible
for detecting weaknesses in the curriculum material, has to look for flaws
as soon as they are detectable. Some flaws will emerge only after the
curriculum is put to use in a number of schools, and these will remain
undetected through the process of judgment. But experimental tryout can take
place only when the material has already been developed to such a standard
that school systems are willing to use it, while judgment of the material can
be done at earlier stages of curriculum development. Thus, as soon as some
portions of the curriculum material have been developed in first draft form,
it is possible to review them critically. Experts may examine the preliminary
draft of a single episode, exercise, or other learning activity even before
its sequential order in a course of studies has been determined. It is not
implied that such a critical review alone is sufficient for evaluating the
quality of the study material, but it is our contention that this method helps
to detect flaws in the material at an early stage of the program development,
when no other method can be used for this purpose.

The employment of judgment as an evaluation tool at the early stage of 
curriculum development is advantageous for several reasons. First, writers 
exhibit greater readiness to change elements of the program during the early 
stages of curriculum development than later after investing much work and 
energy in planning and in writing. Gradually writers become apologetic about 
their work and are inclined to justify what they have produced, rather than 
to accept criticism and re-write portions of the program. Secondly, there is 
great economic advantage in using such an evaluation tool. An expert's
judgment is relatively inexpensive in terms of labor and time. Also, in most
cases, it is easier to manipulate the time schedule of experts than the time
schedule of students. Frequently a certain unit can be tried out only when
students have reached a definite point in their study; the tryout of new study
material is then restricted by the sequence of activities and other conditions
in the would-be experimental classes. Thirdly, utilization of expert judgment
may decrease the time needed for program development. This is especially
important in educational systems where alternative study material is not
available. In such cases one has to consider the loss caused to students by
the fact that they have to use an obsolete program while waiting for
publication of the new material.

Limits of Using Judgmental Data 

The utilization of judgmental data for the sake of curriculum evaluation 
may well be subject to some limitations. There may be situations or
circumstances where such data possess a high level of validity, while in other
situations they may be less valid. There is a need for a series of studies which
will help to specify these limitations and suggest conditions under which the 
utilization of such data is advisable. While there is a need to increase our 
knowledge concerning several aspects of the evaluation process, the present 
paper will be primarily addressed to two such possible limitations: the 
"what" and the "who" questions. 

What to Judge 

The judgment of curriculum material prior to its tryout in classes may
have two major concerns: first, the fit of the material to a set of
specifications, and second, the prediction of the responses or reactions which
the material will elicit.

Figure 1 represents four elements related to the process of planning and 
implementing learning activities. The left column elements represent the 
written curriculum and the right column elements the implemented curriculum. 
The connecting lines indicate areas of evaluation activities. The dotted
lines and the inner quadrangle words represent activities which can be performed
at the pre-tryout stage; the solid lines together with the outer quadrangle words
represent activities which are typically performed at the tryout stage. Four
investigation areas are defined and described below. 

Specifications and Outcomes. The relationship between specifications and
outcomes is the major topic of most evaluation studies. The instruments
commonly used in this context are achievement tests of different types. In
Figure 1 a solid line appears between these two elements, indicating that this
type of information cannot be collected before the experimental tryout of the
curriculum material.

Specifications and Planned Activities. These two elements of the scheme
are connected with dotted lines, suggesting that the relationship between
program specifications and planned activities can be fully studied at the
pre-tryout stage through judgmental procedures, and results obtained at this
stage need not be studied again at the tryout stage.

Planned and Actual Activities. It might be useful to design pre-tryout
studies with the aim of predicting the degree of implementation of the planned
program. If experts predict that the program will not be implemented and
they suppose that the actual activities will differ substantially from the
planned activities, then the developers will be in a position to modify the
program in a way that will increase the probability of its full implementation.
In this area both predictive and empirical-observational studies are
suggested. This is represented in Figure 1 by both a solid and a dotted line
which connect these two elements in the scheme. One has to note, however, that
in the normal course of curriculum development these two types of studies will
relate to different sets of planned activities. The preliminary draft of a
course will first be submitted to judgment, and on the basis of experts' opinion
it will be modified. This modified version of the planned activities, and not
the original one, will subsequently be put to experimental tryout and to an
empirical implementation study.

Actual Activities and Outcomes. The relationship between the actual
activities and outcomes should be empirically studied on the basis of
observational and achievement data. These empirical studies are parallel to
the logical investigation of the fit between specifications and planned
learning activities. Studies of this type have been reported by Rosenshine
(1971).

Thus, two areas are indicated where judgmental evaluation may be of value: 
one is the fit between specifications and planned activities; the other is the 
congruence between planned and actual activities. 

Inventory of Questions 

Several attempts have been made to prepare a list of questions which should
be answered by experts with the aim of evaluating educational programs (Fenton,
1973). In Formative Curriculum Evaluation: A Manual of Procedures, Weiss,
Edwards, and Dimitri (1971) compiled an extensive list of questions referring
to various aspects of the curriculum, and suggested that evaluation experts should
select questions which correspond to their interest. The present paper does not
contain a detailed list of questions which may be presented to experts; it rather
concentrates on indicating several aspects of the program which can be judged.



The Fit of Planned Activities to the Program Specifications 

The questions listed here refer to specifications explicitly stated by 
the program writers and to those which are only implicit. The latter area 
includes such program specifications as presenting only scientifically valid 
facts and grammatically correct sentences. One may question the necessity of
judging the scientific accuracy of curriculum material, but a mere glance at
some widely used textbooks in various subjects will convince the reader of
the necessity for such judgment. Four major aspects of the fit and the
considerations in each are listed below.

CONTENT               scientific accuracy
                      significance of topics
                      omission of issues of major importance
                      fair representation of different views

PRESENTATION          correctness of language
                      aesthetic value of illustrations

LEARNING APPARATUS    provision for individual differences
                      clarity of cues
                      variety of cues
                      variety of learning activities

RELATEDNESS TO        opportunity to learn what the program intends to teach
STATED OBJECTIVES

Prediction of Future Events 

Judgments concerning the implementation of the program and concerning the
outcomes of the program in terms of students' achievements, teachers' reaction
and community reaction constitute predictions of future events. The validity of
such judgments can be studied empirically by later observing the process about
which the prediction has been made, while the validity of judgments of the
previous type does not depend on events occurring in the future.

ACTUAL ACTIVITIES     will the program be used properly

STUDENT INTEREST      will students be interested in the material

DIFFICULTY LEVEL      will the students encounter special difficulties
                      in using the program
                      does the text have an appropriate level of difficulty

TEACHERS' REACTION    will teachers enjoy teaching the material
                      will teachers properly understand the material

COMMUNITY REACTION    will people of the community find any of the material
                      offensive, and interfere with the implementation of
                      the program

Who Should Judge? 

One should select experts in a way which assures maximum validity of their 
judgments. This implies that first one has to sort the questions according to 
the types of expertise needed to answer them, and then one has to make a
decision with regard to the individuals to be solicited. To judge the scientific
accuracy of the material one will invite subject specialists. If within a
single subject there are several alternative points of view, one should call upon
a subject expert who identifies himself with the point of view which is reflected
in the curriculum material. Also, if in a course in social studies conflicting
opinions about a controversial social issue are presented, each of the different 
opinions should be examined by a person who is a prominent representative of 
that particular view, and each one should indicate whether the material properly 
represents that particular view. 

The selection of experts for judging the fit between the program specifica- 
tions and the planned learning activities should be made on the basis of logical 
considerations. As a rule of thumb it is suggested that experts should be
consulted whose competency is recognized by the curriculum producers and by those
who will assume responsibility for making decisions regarding the use of the
program. The selection of experts to predict events, to foresee actual learning
experiences and outcomes, can be facilitated by systematic empirical research.
Studies can be devised to explore the validity of predictions of experts of
different types. By comparing a posteriori events with predicted events one may
generalize about the ability of teachers, school administrators, school
psychologists and educational psychologists to predict events of different types.
Much can be learned from systematic studies of this type about proper preferences
in selecting experts to predict future events. Studies of this type have succeeded
in providing useful techniques for improving the validity of prediction (Helmer,
1967).
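
The comparison of predicted with a posteriori events can be sketched in a few
lines of code. The sketch below is only an illustration: the paper does not
prescribe a scoring rule, so the simple proportion of correct predictions used
here, the function name hit_rates, and the sample records are all assumptions
introduced for the example.

```python
from collections import defaultdict

def hit_rates(records):
    """Proportion of correct predictions per type of expert.

    Each record is (expert_type, predicted_event, observed_event).
    The proportion-correct score is an assumed, illustrative measure.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for expert_type, predicted, observed in records:
        totals[expert_type] += 1
        if predicted == observed:
            hits[expert_type] += 1
    return {t: hits[t] / totals[t] for t in totals}

# Invented records, for illustration only.
records = [
    ("teacher", "implemented", "implemented"),
    ("teacher", "too difficult", "too difficult"),
    ("teacher", "implemented", "not implemented"),
    ("administrator", "implemented", "not implemented"),
    ("administrator", "too difficult", "too difficult"),
]

rates = hit_rates(records)
```

Rates of this kind, accumulated over many programs, could inform the choice of
which type of expert to consult for which type of prediction.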

The Value of Judgmental Data 

The utilization of judgmental data is justified only if it detects flaws, 
the correction of which increases the quality of the curriculum material and 
improves the outcomes of learning. In other words, one should ask whether judg- 
mental data are likely to yield valid conclusions. Three methods of examining 
the validity of judgmental data will be indicated below. They are listed in in- 
creasing order of strength to support the validity of conclusions derived from 
judgmental data. 

Agreement Among Experts 

One way to examine the validity of judgmental data is to quantify the degree
of consensus obtained among experts with regard to different issues. Consensus
among experts is usually considered an indicator of reliability. If one uses
the term validity in a sense of "well grounded, justifiable, true in terms of
logistic system to which the inference belongs", at least with regard to some
questions, one may consider the consensus among experts as support of validity.
Thus, for instance, the correctness of a statement, considering the present state
of facts within the framework of a discipline, can be supported and proved by the
consensus of competent experts.

It should be noted, however, that expert judgment should serve as a basis 
for the modification of curriculum material only if there is a considerable amount 
of consensus between experts with regard to certain issues. A mere majority of 
opinions should not necessarily require modification of the curriculum material. 
If a minority of competent experts have different opinions from those of the 
majority, the curriculum writers may be justified in following the opinion of the
minority. Only an overwhelming majority of opinions should demand action.
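
As a minimal sketch of how the degree of consensus might be quantified, the
share of experts holding the most common opinion on each issue can be computed
and compared with a cutoff. The paper gives no numerical definition of an
"overwhelming majority"; the threshold of 0.8, the opinion labels "revise" and
"keep", and the sample judgments are assumptions made for this illustration.

```python
from collections import Counter

def consensus(votes):
    """Return the most common opinion and the share of experts holding it."""
    opinion, count = Counter(votes).most_common(1)[0]
    return opinion, count / len(votes)

def issues_requiring_action(judgments, threshold=0.8):
    """List issues on which an overwhelming majority asks for revision.

    threshold is an assumed cutoff, not a value taken from the paper.
    """
    flagged = []
    for issue, votes in judgments.items():
        opinion, share = consensus(votes)
        if opinion == "revise" and share >= threshold:
            flagged.append(issue)
    return flagged

# Invented judgments of ten experts on three issues.
judgments = {
    "scientific accuracy of unit 3": ["revise"] * 9 + ["keep"],
    "clarity of cues in exercise 2": ["revise"] * 6 + ["keep"] * 4,
    "aesthetic value of illustrations": ["keep"] * 8 + ["revise"] * 2,
}

flagged = issues_requiring_action(judgments)
```

Under these assumptions only the first issue is flagged; the second, with a
mere majority of six out of ten, is left to the discretion of the writers.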

The Shuffle Test 

Another method of validating the conclusions based on judgmental data may 
be the employment of "the shuffle test." Using this approach, the curriculum 
team modifies the program according to suggestions emerging from the judgmental 
data; the two versions of the curriculum material, the original and the
modified, are then presented to another team of experts who select the more
appropriate one. It is assumed that if the modifications improved the quality of
the program, the experts will prefer the modified version to the original one.
Preference for the revised version can serve as evidence of the contribution
of expert judgment to the improvement of the curriculum material. Of course,
the "shuffle test" can be relied on only if the modifications were
satisfactorily carried out in line with the suggestions.
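
One way to judge whether an observed preference for the revised version exceeds
what chance alone would produce is a one-sided sign (binomial) test. The paper
does not specify any statistical procedure for the shuffle test, so the choice
of test, the function name, and the figures in the example are assumptions of
this sketch.

```python
from math import comb

def shuffle_test_p_value(prefer_revised, n_experts):
    """One-sided sign test.

    Probability that at least prefer_revised of n_experts would choose the
    revised version if each expert were choosing between the two versions
    at random (probability 0.5 for each version).
    """
    tail = sum(comb(n_experts, k) for k in range(prefer_revised, n_experts + 1))
    return tail / 2 ** n_experts

# Hypothetical outcome: 9 of 10 experts prefer the revised version.
p = shuffle_test_p_value(9, 10)
```

A small value suggests that the preference is unlikely to be accidental. The
test presumes that the experts do not know which version is the revised one and
that they judge independently of one another.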

Experimental Validation 

The most conclusive method of validating judgmental data involves
experimentally validating the materials in operation. In order to conduct such
an experiment, the developers must maintain two versions of the material: one
would be modified according to experts' judgment; the other would use only
the internal review of the curriculum. These two programs are then implemented
in a random sample of classes; observations are made of the teaching process;
and outcomes of the two types of study material are measured. Such a design
can be employed only with regard to judgments related to future events. It
can be used to examine the implementation of a proposed program, but it cannot
be used to examine the scientific accuracy and significance of the material
presented in the curriculum. Since this validation procedure is costly and
time-consuming, it is not recommended for use in the normal course of
curriculum development. Its use may be restricted to examining the validity of
judgmental data in general.
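
The allocation of classes to the two versions and a first comparison of
outcomes can be sketched as follows. This is an illustration under stated
assumptions: the paper mentions only a random sample of classes, so the equal
split, the group labels, the function names, and the use of a simple difference
of means are all introduced here for the example.

```python
import random

def assign_versions(classes, seed=None):
    """Randomly split classes between the two versions of the material."""
    rng = random.Random(seed)
    shuffled = list(classes)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "expert_revised": shuffled[:half],
        "internal_review_only": shuffled[half:],
    }

def mean_difference(outcomes_a, outcomes_b):
    """Difference between the mean outcomes of two groups of classes."""
    return sum(outcomes_a) / len(outcomes_a) - sum(outcomes_b) / len(outcomes_b)

# Hypothetical class labels.
classes = ["c1", "c2", "c3", "c4", "c5", "c6"]
groups = assign_versions(classes, seed=1)
```

A difference of means is only a starting point; a full study would also need
the observational records of the teaching process described above.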



Summary 

Judgmental data are frequently used in the social sciences, and their
applicability to curriculum evaluation has been emphasized by experts. They can
be produced at a low cost and at the early stages of curriculum evaluation when
material is not yet ready for experimental tryout. Judgmental data may be used
to evaluate the fit between the specifications of the program and the planned or
written curriculum. They can also be used to predict the degree of
implementation of the program. The validity of judgmental data can be examined
in different ways, including consensus among experts, a shuffle test with regard
to the original version of the curriculum material and the revised version, and
experimental tryout of the original and revised program.